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Accelerating Fermionic Molecular Dynamics 

M. A. Clark^ and A. D. Kennedy'" 

'"School of Physics, The University of Edinburgh, 
Edinburgh EH9 3JZ, United Kingdom 

We consider how to accelerate fermionic molecular dynamics algorithms by introducing n pseudofermion fields 
coupled with the nth root of the fermionic kernel. This reduces the maximum pseudofermionic force, and 
thus allows a larger molecular dynamics integration step size without hitting an instability in the integrator. 



1. Introduction 

For over fifteen years the algorithm of choice for 
generating lattice field theory configurations in- 
cluding the dynamical effect of fermions has been 
Hybrid Monte Carlo (HMC) 1 . Unfortunately 
the cost of this algorithm increases rapidly as the 
fermion mass m decreases; in order to keep the 
HMC acceptance rate P^cc constant the molecu- 
lar dynamics integration step size St has to be 
reduced, and for molecular dynamics trajectories 
of length T — 1 this corresponds directly to an 
increased number of molecular dynamics integra- 
tion steps and hence larger cost. 

This required decrease in step size is because 
of the breakdown of symmetric symplectic inte- 
grators. For light dynamical fermions there is an 
instability for a few isolated light fermion modes, 
whose frequency is well separated from the bulk 
of the modes. This instability is seen to be di- 
rectly responsible for the exponential decrease of 
HMC acceptance with integration step size above 
some critical value. 

We introduce a method for reducing the sever- 
ity of this problem by reducing the highest "ef- 
fective frequency" of the fermionic modes, or 
equivalently of decreasing the magnitude of the 
fermionic contribution to the force acting on the 
gauge fields. The basic idea follows the sugges- 
tion of Hasenbusch 2 to split the fermionic action 
into two parts, and to introduce separate pseudo- 
fermion fields for each part. Our approach can 
be easily generalised to an arbitrary number of 
pseudofermion fields.^ 



2. Non-linearity of CG 

We observe that the force due to the fermion 
kernel Ai^^ \s dominated by the smallest eigen- 
values of M.. The condition number k{M.) is 
the ratio of the largest eigenvalue to the smallest 
eigenvalue, and to a first approximation controls 
the rate of convergence of iterative Krylov space 
solvers. The largest eigenvalue remains approx- 
imately constant as the fermion mass m is de- 
creased, and the smallest eigenvalue is typically 
of the order to" where a is 1 or 2, so we expect 
k(A^) cx mT" . 

Consider the numerical solution of the linear 
system ^Ax — ^^ where > and has condition 
number k{M.). The cost of solving these linear 
equations is proportional to m~". On the other 
hand we could equivalently solve the set of cou- 
pled linear equations V A4x = 4' ^J^d \/ Aitp ~ 4>, 
each of which has condition number k[\/M) — 
\J leading to a cost of order 2m~°'/'^ in 
this case, which is cheaper for sufficiently small to. 
This reflects the essential non-linearity of Krylov 
space solvers. Indeed, we may even be more ad- 
venturous and solve the set of n coupled systems 
\/jAipj = ^j+i, where ipo = X ^iid = f^, for 
which we have to perform n solves each with con- 
dition number — leading to a 
total cost of order nK(Al)^/". 

Unfortunately we cannot take advantage of 
this non-linearity, the problem being that it is 



^Hasenbusch's method also allows an arbitrary number of 



pseudofermion fields to be used, but the parameters in the 
action have to be tuned to ensure that the contributions 
to the force are divided up equally. 

^For M > the positive square root is uniquely defined. 
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not straightforward to apply VaT to a vector 
when M is not serendipitously a manifest square. 
There are efheient techniques for evaluating ma- 
trix functions, such as computing the optimal 
polynomial or rational Chebyshev approximation 
to the function over the spectrum of the matrix 
[^. For the rational case the approximations may 
be found using the Remez algorithm, and they 
usually converge exponentially in the degree of 
the rational function. In practice only a rela- 
tively low degree rational function is needed to 
achieve machine floating-point precision . Fur- 
thermore, if we take a rational approximation 
then we can express it as a partial fraction, and 
apply all the terms simultaneously (in the same 
Krylov space) using a multi-shift solver. This re- 
duces the cost of solving Ai^^^^x = b to about 
the same cost as solving Mx = b. This in turn 
is expected to be proportional to the condition 
number k{M). Sadly, this means that the cost 
of the proposed method is of order nK{Ai) rather 
than rtK(A^)^/", and we are worse off than when 
we started. 

Although this method is clearly useless to accel- 
erate the convergence of Krylov space solvers, it 
still does significantly reduce the condition num- 
ber of each of the n solves, and this is what we 
shall make use of to decrease the pseudofermionic 
force. Indeed, we expect the force to be of or- 
der nK(A^)^/"5T, which is small compared to 
k{M.)6t for large k{A4). A naive calculation min- 
imising nK{A4y^'^/K{M) leads to the conclusion 
that the optimal number of pseudofermion fields 
should be^ n^pt — \nK{M). 

3. Pseudofermion Sampling 

Recall that we represent the fermion determi- 
nant as a pseudofermion Gaussian functional in- 
tegral, detM (X / d(f>d(j)^ exp (^—(f)^ M~^(j)) , and 
then select a single equilibrium pseudofermion 
configuration using a Gaussian heatbath. We ex- 
pect, therefore, that the variance of this stochas- 
tic estimate of the fermion determinant will lead 
to statistical fiuctuations in the fermionic force: 
in other words the pseudofermionic force may be 

^Strictly speaking n^pt 6 so it must be either LlnK(jVf)J 
or lln k(M)~\ . 



larger than the exact fermionic force, which is 
the functional derivative dtr\nA4{U)/dU with 
respect to the gauge field U. This means that 
the pseudofermionic force may trigger the insta- 
bility in the symplectic integrator even though 
the exact fermionic force would not. 

An obvious way of ameliorating this effect is to 
use n > 1 pseudofermion fields (which we shall 
call muUipseudofermions) to sample the func- 
tional integral representing the fermion determi- 
nant, and this is achieved simply by writing 

detM - [detM^/"]" 

n 

(X Y[ ^'^J d<l)] exp ; 

that is, introducing n pseudofermion fields 
each with kernel Ai"^^". 

We may now follow a similar argument to that 
in fJ21 to estimate the optimal value for n. We 
must keep the maximum force fixed so as to avoid 
the instability in the integrator, so we may in- 
crease the integration step size to St' such that 
nK(A^)"'^/"5r' = k{M.)St. At constant trajectory 
length, and hence constant autocorrelation time, 
the cost of an HMC trajectory is proportional to 
the step size, and thus is minimised by choos- 
ing n so as to minimise St'/St = K{Aiy~^^'^ /n, 
which leads to the condition n^pt « Ihk{A4), 
corresponding to cost reduction by a factor of 
St/St' « e In k(7W)/k(7W). 

4. RHMC Acceleration 

Our method is to apply the RHMC algorithm 
^ to generate gauge field and pseudofermion con- 
figurations distributed according to the probabil- 
ity density 

P(C/, (/.I, . . . , 0„) = I exp [-Sb{U) - Sf{M)] , 

where Sb is the bosonic (pure gauge) action and 
SpiM) = E"=i<^]-^"^^'Vj- Optimal rational 
approximations are used to evaluate these matrix 
functions, and we proceed as we would for con- 
ventional HMC PP . Using a multi-shift solver the 
computational cost is very similar to HMC, with 
the additional overhead of having to perform a 
matrix inversion to evaluate the heatbath. 
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Figure 1. Efficiency £^ as a function of integration 
step size 6t, for QCD with 4 flavours of naive 
staggered quarks of mass m = 0.025 at /? = 5.26 
on an 8^ lattice with trajectory length t — 1. 



We have performed tests of the algorithm us- 
ing small lattices with four flavours of nai've stag- 
gered fermions. As well as being the least com- 
putationally demanding simulations to perform, 
the improvement for such systems should be a 
lower bound on the improvement for large volume 
and/or Wilson type fermion simulations, where 
the fermion matrix is less well-conditioned. 

The n = 1 case runs were performed using con- 
ventional HMC, and the n > 1 runs using RHMC. 
We define an efficiency measure E = (Pacc)/A^inv, 
where 7Vi„v is number of Krylov solves performed 
per trajectory. For HMC this is given by 1 + t/St 
and for RHMC by (2 -I- T/ST)n, where r is the 
trajectory length and St the step size. The ad- 
ditional inversion per field for RHMC is for the 
pseudofcrmion heatbath. 

Figure n is a plot of the efficiency against step 
size for various value of n. The conjecture that 
there is some optimal value of n is confirmed, and 
it can be seen that n = 2 is the value for this 
particular lattice. It represents an increase in ef- 
ficiency of 33% {n = 2 peak divided by n = 1 
peak). We have conducted the same test with 



a mass parameter of to = 0.01, which shows 
the same optimal value of n, but with a sub- 
stantial efficiency increase of 60%. This confirms 
that the improvement factor increases as less well- 
conditioned systems are studied. 

5. Conclusions 

We have demonstrated that our method for ac- 
celerating the Monte Carlo acceptance test im- 
proves the efficiency over conventional HMC. This 
is clearly still a very preliminary study, and shall 
be extended to large scale systems for both ASQ- 
TAD and domain wall simulations using QCDOC. 
On such systems, we would expect the improve- 
ment in efficiency to increase, and it will be in- 
teresting to see how much improvement can be 
gained. It would also be interesting to compare 
our method with that proposed by Hasenbusch 
in[2]. 
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